{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Problems"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 11-1 Longest-probe bound for hashing\n",
    "\n",
    "> Suppose that we use an open-addressed hash table of size $m$ to store $n \\le m/2$ items."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> __*a*__. Assuming uniform hashing, show that for $i=1,2,\\dots,n$, the probability is at most $2^{-k}$ that the $i$th insertion requires strictly more than $k$ probes."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$\n",
    "\\begin{array}{rll}\n",
    "\\displaystyle \\text{Pr}\\{X_i > k\\} &=& \n",
    "\\displaystyle \\frac{n}{m} \\cdot \\frac{n - 1}{m - 1} \\cdots \\frac{n - k + 1}{m - k + 1} \\\\\n",
    "&\\le& \\displaystyle \\left ( \\frac{n}{m} \\right ) ^ {k} \\\\\n",
    "&\\le& \\displaystyle \\left ( \\frac{1}{2} \\right ) ^ {k} \\\\\n",
    "&=& 2^{-k}\n",
    "\\end{array}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> __*b*__. Show that for $i=1,2,\\dots,n$, the probability is $O(1/n^2)$ that the $i$th insertion requires more than $2\\lg n$ probes."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$\n",
    "\\displaystyle \\text{Pr}\\{X_i > 2\\lg n\\} \\le 2^{-2 \\lg n} = 1/n^2 = O(1/n^2)\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> Let the random variable $X_i$ denote the number of probes required by the $i$th insertion. You have shown in part (b) that $\\text{Pr}\\{X_i > 2\\lg n\\} = O(1/n^2)$. Let the random variable $X = \\max_{1 \\le i \\le n} X_i$ denote the maximum number of probes required by any of the $n$ insertions."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> __*c*__. Show that $\\text{Pr}\\{ X > 2\\lg n\\}=O(1/n)$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$\n",
    "\\text{Pr}\\{ X > 2\\lg n\\}\\le\\sum_{i=1}^n 1/n^2 = 1/n =O(1/n)\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> __*d*__. Show that the expected length $\\text{E}[X]$ of the longest probe sequence is $O(\\lg n)$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$\n",
    "\\begin{array}{rll}\n",
    "\\text{E}[X] \n",
    "&=& \\displaystyle \\sum_{i=1}^n i \\cdot \\text{Pr}\\{X = i\\} \\\\\n",
    "&=& \\displaystyle \\sum_{i=1}^{2 \\lg n} i \\cdot \\text{Pr}\\{X = i\\} + \\sum_{i=2 \\lg n + 1}^n i \\cdot \\text{Pr}\\{X = i\\} \\\\ \n",
    "&\\le& \\displaystyle 2 \\lg n \\cdot \\sum_{i=1}^{2 \\lg n} \\text{Pr}\\{X = i\\} + n \\cdot \\sum_{i=2 \\lg n + 1}^n \\text{Pr}\\{X = i\\} \\\\ \n",
    "&\\le& \\displaystyle 2 \\lg n \\cdot 1 + n \\cdot 1/n \\\\ \n",
    "&=& \\displaystyle 2 \\lg n + 1 \\\\ \n",
    "&=& O(\\lg n)\n",
    "\\end{array}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 11-2 Slot-size bound for chaining\n",
    "\n",
    "> Suppose that we have a hash table with $n$ slots, with collisions resolved by chaining, and suppose that $n$ keys are inserted into the table. Each key is equally likely to be hashed to each slot. Let $M$ be the maximum number of keys in any slot after all the keys have been inserted. Your mission is to prove an $O(\\lg n/\\lg\\lg n)$ upper bound on $\\text{E}[M]$, the expected value of $M$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> __*a*__. Argue that the probability $Q_k$ that exactly $k$ keys hash to a particular slot is given by\n",
    "\n",
    "> $\\displaystyle Q_k = \\left ( \\frac{1}{n} \\right ) ^ k \\left ( 1 - \\frac{1}{n} \\right ) ^ {n - k} \\binom{n}{k}$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Obviously."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> __*b*__. Let $P_k$ be the probability that $M = k$, that is, the probability that the slot containing the most keys contains $k$ keys. Show that $P_k \\le n Q_k$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$nQ_k$ is the probability that at least one slot contains $k$ keys, thus $P_k \\le nQ_k$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> __*c*__. Use Stirling's approximation, equation (3.18), to show that $Q_k < e^k / k^k$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$\n",
    "\\begin{array}{rll}\n",
    "Q_k &=& \\displaystyle Q_k = \\left ( \\frac{1}{n} \\right ) ^ k \\left ( 1 - \\frac{1}{n} \\right ) ^ {n - k} \\binom{n}{k} \\\\\n",
    "&=& \\displaystyle \\frac{(n-1)^{n-k}}{n^n} \\cdot \\frac{n \\cdot (n-1) \\cdots (n - k + 1)}{k!} \\\\\n",
    "&\\le& \\displaystyle \\frac{(n-1)^{n-k}}{n^n} \\cdot \\frac{n^k}{\\sqrt{2 \\pi k} \\left ( \\frac{k}{e} \\right ) ^ k} \\\\\n",
    "&=& \\displaystyle \\left ( \\frac{n - 1}{n} \\right ) ^ {n - k} \\cdot \\frac{(e/k)^k}{\\sqrt{2 \\pi k}} \\\\\n",
    "&\\le& e^k / k^k\n",
    "\\end{array}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> __*d*__. Show that there exists a constant $c>1$ such that $Q_{k_0} < 1/n^3$ for $k_0 = c \\lg n / \\lg \\lg n$. Conclude that $P_k < 1/n^2$ for $k \\ge k_0 = c\\lg n/ \\lg \\lg n$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$\n",
    "\\begin{array}{rll}\n",
    "\\lg Q_{k_0} \n",
    "&=& \\displaystyle \\frac{c \\lg n (\\lg e - \\lg c)}{\\lg \\lg n} + c\\lg n \\cdot \\left ( \\frac{\\lg\\lg\\lg n}{\\lg \\lg n} - 1 \\right )\n",
    "\\end{array}\n",
    "$$\n",
    "\n",
    "The maximum of $\\displaystyle\\frac{\\lg\\lg\\lg n}{\\lg \\lg n}$ is $\\displaystyle \\frac{1}{e \\log(2)} \\approx 0.5307$, and converge to $0$ when $n \\rightarrow \\infty$. For a large $n$, if $c > 3$, the first term is negative and\n",
    "\n",
    "$$\n",
    "\\lg Q_{k_0} < -3\\lg n = \\lg \\frac{1}{n^3}\n",
    "$$\n",
    "\n",
    "$$\n",
    "\\therefore Q_{k_0} < \\frac{1}{n^3}\n",
    "$$\n",
    "\n",
    "$$\n",
    "\\therefore P_k \\le n Q_k < \\frac{1}{n^2}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> __*e*__. Argue that\n",
    "\n",
    "> $\\displaystyle \\text{E}[M] \\le \\text{Pr} \\left \\{ M > \\frac{c \\lg n}{\\lg \\lg n} \\right \\} \\cdot n + \\text{Pr} \\left \\{ M \\le \\frac{c \\lg n}{\\lg \\lg n} \\right \\} \\cdot \\frac{c \\lg n}{\\lg \\lg n}$.\n",
    "\n",
    "> Conclude that $E[M] = O(\\lg n/ \\lg \\lg n)$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$\n",
    "\\begin{array}{rll}\n",
    "\\text{E}[M]\n",
    "&=& \\displaystyle \\sum_{i=1}^n \\text{Pr}\\{M=i\\} \\cdot i\\\\\n",
    "&=& \\displaystyle \\sum_{i=1}^{\\displaystyle \\frac{c\\lg n}{\\lg \\lg n}} \\text{Pr}\\{M=i\\} \\cdot i + \\sum_{\\displaystyle i=\\frac{c\\lg n}{\\lg \\lg n}+1}^{n} \\text{Pr}\\{M=i\\} \\cdot i \\\\\n",
    "&\\le& \\displaystyle \\sum_{i=1}^{\\displaystyle \\frac{c\\lg n}{\\lg \\lg n}} \\text{Pr}\\{M=i\\} \\cdot \\frac{c\\lg n}{\\lg \\lg n} + \\sum_{\\displaystyle i=\\frac{c\\lg n}{\\lg \\lg n}+1}^{n} \\text{Pr}\\{M=i\\} \\cdot n \\\\\n",
    "&=& \\displaystyle \\text{Pr} \\left \\{ M \\le \\frac{c \\lg n}{\\lg \\lg n} \\right \\} \\cdot \\frac{c \\lg n}{\\lg \\lg n} + \\text{Pr} \\left \\{ M > \\frac{c \\lg n}{\\lg \\lg n} \\right \\} \\cdot n \\\\\n",
    "&<& \\displaystyle 1 \\cdot \\frac{c \\lg n}{\\lg \\lg n} + \\frac{1}{n^2} \\cdot n \\\\\n",
    "&=& \\displaystyle \\frac{c \\lg n}{\\lg \\lg n} + \\frac{1}{n} \\\\\n",
    "&=& \\displaystyle O(\\lg n/ \\lg \\lg n) \\\\\n",
    "\\end{array}\n",
    "$$\n",
    "\n",
    "$$\n",
    "\\left ( \\lim_{n \\rightarrow \\infty} \\frac{\\frac{1}{n}}{\\frac{\\lg n}{\\lg \\lg n}} = \\lim_{n \\rightarrow \\infty} \\frac{\\lg (\\lg n)}{\\lg (n^n)} = 0 \\right )\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 11-3 Quadratic probing\n",
    "\n",
    "> Suppose that we are given a key $k$ to search for in a hash table with positions $0,1,\\dots, m-1$, and suppose that we have a hash function $h$ mapping the key space into the set $\\{0,1,\\dots,m-1\\}$. The search scheme is as follows:\n",
    "\n",
    "> 1. Compute the value $j=h(k)$, and set $i=0$.\n",
    "> 2. Probe in position $j$ for the desired key $k$. If you find it, or if this position is empty, terminate the search.\n",
    "> 3. Set $i = i + 1$. If $i$ now equals $m$, the table is full, so terminate the search. Otherwise, set $j = (i + j) ~\\text{mod}~ m$, and return to step 2.\n",
    "\n",
    "> Assume that $m$ is a power of 2."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> __*a*__. Show that this scheme is an instance of the general \"quadratic probing\" scheme by exhibiting the appropriate constants $c_1$ and $c_2$ for equation (11.5)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The $i$th probing is equivalent to $(j + \\frac{i(i+1)}{2}) ~\\text{mod}~ m$, thus $c_1 = 1/2$, $c_2 = 1/2$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> __*b*__. Prove that this algorithm examines every table position in the worst case."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Suppose there are two probing $i$ and $j$, and $0 \\le j < i < m$.\n",
    "\n",
    "Suppose the two probing examine the same position, then:\n",
    "\n",
    "$$\n",
    "\\begin{array}{rlll}\n",
    "(i + i^2) - (j + j^2) &=& 2km &(k \\ge 0) \\\\\n",
    "(i+j+1)(i-j) &=& 2km\n",
    "\\end{array}\n",
    "$$\n",
    "\n",
    "Since $i > j$, then $k \\ne 0$.\n",
    "\n",
    "Note that $m$ is a power of 2.\n",
    "\n",
    "If $i$ and $j$ are both even or both odd, then only $(i-j)$ could be even, since $i < m$, $(i - j) < m < 2m$, thus $2m$ could not be a factor of $(i - j)$.\n",
    "\n",
    "If one of $i$ and $j$ is odd and the other is even, then only $(i + j + 1)$ could be even, since $i < m$, $(i + j + 1) < 2m$, thus $2m$ could not be a factor of $(i + j + 1)$.\n",
    "\n",
    "Therefore $i$ and $j$ could not probing the same position, this algorithm examines every table position in the worst case."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 11-4 Hashing and authentication\n",
    "\n",
    "> Let $\\mathcal{H}$ be a class of hash functions in which each hash function $h \\in \\mathcal{H}$ maps the universe $U$ of keys to $\\{ 0, 1, \\dots, m - 1\\}$. We say that $\\mathcal{H}$ is __*k-universal*__ if, for every fixed sequence of $k$ distinct keys $\\langle x^{(1)}, x^{(2)}, \\dots, x^{(k)} \\rangle$ and for any $h$ chosen at random from $\\mathcal{H}$, the sequence $\\langle h(x^{(1)}), h(x^{(2)}), \\dots, h(x^{(k)}) \\rangle$ is equally likely to be any of the $m^k$ sequences of length $k$ with elements drawn from $\\{ 0, 1, \\dots, m - 1 \\}$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> __*a*__. Show that if the family $\\mathcal{H}$ of hash functions is 2-universal, then it is universal."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The number of hash functions for which $h(k)=h(l)$ is $\\frac{m}{m^2}|\\mathcal{H}|=\\frac{1}{m}|\\mathcal{H}|$, therefore the family is universal."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> __*b*__. Suppose that the universe $U$ is the set of $n$-tuples of values drawn from $\\mathbb{Z}_p = \\{ 0, 1, \\dots, p - 1 \\}$, where $p$ is prime. Consider an element $x = \\langle x_0, x_1, \\dots, x_{n-1} \\rangle \\in U$. For any $n$-tuple $a = \\langle a_0, a_1, \\dots, a_{n-1} \\rangle \\in U$, define the hash function $h_a$ by\n",
    "\n",
    "> $\\displaystyle h_a(x) = \\left ( \\sum_{j=0}^{n-1} a_j x_j \\right ) ~\\text{mod}~ p$.\n",
    "\n",
    "> Let $\\mathcal{H}=\\{h_a\\}$. Show that $\\mathcal{H}$ is universal, but not 2-universal."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For $x = \\langle 0, 0, \\dots, 0 \\rangle$, $\\mathcal{H}$ could not be 2-universal."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> __*c*__. Suppose that we modify $\\mathcal{H}$ slightly from part (b): for any $a \\in U$ and for any $b \\in \\mathbb{Z}_p$, define\n",
    "\n",
    "> $\\displaystyle h'_{ab}(x)=\\left ( \\sum_{j=0}^{n-1} a_j x_j \\right ) ~\\text{mod}~ p$\n",
    "\n",
    "> and $\\mathcal{H}'=\\{h'_{ab}\\}$. Argue that $\\mathcal{H}'$ is 2-universal."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> __*d*__. Suppose that Alice and Bob secretly agree on a hash function $h$ form 2-universal family $\\mathcal{H}$ of hash functions. Each $h \\in \\mathcal{H}$ maps from a universe of keys $u$ to $\\mathbb{Z}_p$, where $p$ is aprime. Later, Alice sends a message $m$ to Bob over the Internet, where $m \\in U$. She authenticates this message to Bob by also sending an authentication tag $t = h(m)$, and Bob checks that the pair $(m, t)$ he receives indeed satisfies $t = h(m)$. Suppose that an adversary intercepts $(m, t)$ en route and tries to fool Bob by replacing the pair $(m, t)$ with a different pair $(m', t')$. Argue that the probability that the adversary succeeds in fooling Bob into accepting $(m', t')$ is at most $1/p$, no matter how much computing power the adversary has, and even if the adversary knows the family $\\mathcal{H}$ of hash functions used."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Since $\\mathcal{H}$ is 2-universal, every pair of $\\langle t, t' \\rangle$ is equally likely to appear, thus $t'$ could be any value from $\\mathbb{Z}_p$. Even the adversary knows $\\mathcal{H}$, since $\\mathcal{H}$ is 2-universal, then $\\mathcal{H}$ is universal, the probability of choosing a hash function that $h(k)=h(l)$ is at most $1/p$, therefore the probability is at most $1/p$."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
